The method is trained on the data that were available, but it is meant to be re-trainable as soon as new data are published. It would be great to be really sure that even someone else will be able to do it. In case we receive any feedback, we would be really happy to improve our Github repository so as to make the reproduction easier!
This paper presents a fine example of high-throughput computational materials screening studies, mainly focusing on the carbon nanoclusters of different sizes. In the paper, a set of diverse empirical and machine-learned interatomic potentials, which are commonly used to simulate carbonaceous materials, is benchmarked against the higher-level density functional theory (DFT) data, using a range of diverse structural features as the comparison criteria. Trying to reproduce the data presented here (even if you only consider a subset of the interaction potentials) will help you devise an understanding as to how you could approach a high-throughput structure prediction problem. Even though we concentrate here on isolated/finite nanoclusters, AIRSS (and other similar approaches like USPEX, CALYPSO, GMIN, etc.,) can also be used to predict crystal structures of different class of materials with applications in energy storage, catalysis, hydrogen storage, and so on.
The results of this paper have been used in multiple subsequent studies as a benchmark against which other methods of performing the same calculation have been tested. Other groups have challenged the results as suffering from finite size effects, in particular the calculations on mixtures of cubic and hexagonal ice. Should there be time during in the event, participants could check this by performing calculations on larger unit cells. Each individual calculation should converge adequately within 96 hours making it amenable to a HPC ReproHack. Given modern HPC hardware many such calculations could be run concurrently on a single HPC node.
- This paper is a good example of a standard social science study that is (I hope!) fully reproducible, from main analysis, to supplementary analyses and figures. - I have not yet received any external feedback w.r.t. its reproducibility, so would be interested to see if I have overlooked any gaps in the reproduction workflow that I anticipated.
The results of the individual studies (4) could be interpreted in support for the hypothesis, but the meta-analysis suggested that implicit identification was not a useful predictor overall. This conclusion is an important goalpost for future work.